Advances in generative modelling: from component analysis to generative adversarial networks
This Thesis revolves around datasets and algorithms, with a focus on generative modelling. In particular, we first turn our attention to a novel, multi-attribute, 2D facial dataset. We then present deterministic as well as probabilistic Component Analysis (CA) techniques which can be applied to multi-attribute 2D as well as 3D data. We finally present deep learning generative approaches specially designed to manipulate 3D facial data.
Most 2D facial datasets available in the literature are automatically or semi-automatically collected and thus contain noisy labels, which hinders benchmarking and comparison between algorithms. Moreover, they are not annotated for multiple attributes. In the first part of the Thesis, we present the first manually collected and annotated database that contains labels for multiple attributes. As we demonstrate in a series of experiments, it can be used in a number of applications, ranging from image translation to age-invariant face recognition.
Moving on, we turn our attention to CA methodologies. Although CA approaches can only capture linear relationships in the data, they remain effective on data such as UV maps or 3D scans registered to a common template, since such data are well aligned. The introduction of more complex datasets in the literature, which contain labels for multiple attributes, naturally brought the need for novel algorithms that can handle multiple attributes simultaneously. In this Thesis, we cover novel CA approaches specifically designed for datasets annotated with respect to multiple attributes, which can be used in a variety of tasks, such as 2D image denoising and translation, as well as 3D data generation and identification.
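The core assumption behind the CA methods above is that well-aligned data (e.g. flattened UV maps) lie near a low-dimensional linear subspace. A minimal PCA-based denoising sketch illustrates this; it is illustrative only and not one of the Thesis's multi-attribute algorithms, and all names below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for aligned data (e.g. flattened UV maps): samples lie in
# a low-dimensional linear subspace, which is what CA methods exploit.
n_samples, n_features, n_components = 200, 50, 5
basis = rng.standard_normal((n_components, n_features))
coeffs = rng.standard_normal((n_samples, n_components))
X = coeffs @ basis  # clean, well-aligned data

def fit_pca(X, k):
    """Fit a linear model: mean + top-k principal directions via SVD."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def denoise(x, mean, components):
    """Project a noisy sample onto the learned linear subspace."""
    return mean + (x - mean) @ components.T @ components

mean, components = fit_pca(X, n_components)
noisy = X[0] + 0.1 * rng.standard_normal(n_features)
clean = denoise(noisy, mean, components)
```

The projection discards the noise component orthogonal to the learned subspace, so `clean` is closer to the original sample than `noisy` is.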
Nevertheless, while CA methods are indeed efficient when handling registered 3D facial data, linear 3D generative models lack detail when it comes to reconstructing or generating finer facial characteristics. To alleviate this, in the final part of this Thesis we propose a novel generative framework harnessing the power of Generative Adversarial Networks.
AgeDB: the first manually collected, in-the-wild age database
Over the last few years, increased interest has arisen with respect to age-related tasks in the Computer Vision community. As a result, several "in-the-wild" databases annotated with respect to the age attribute became available in the literature. Nevertheless, one major drawback of these databases is that they are semi-automatically collected and annotated and thus contain noisy labels. Therefore, algorithms evaluated on such databases are prone to noisy estimates. In order to overcome such drawbacks, we present in this paper the first, to the best of our knowledge, manually collected "in-the-wild" age database, dubbed AgeDB, containing images annotated with noise-free labels that are accurate to the year. As demonstrated by a series of experiments utilizing state-of-the-art algorithms, this unique property renders AgeDB suitable when performing experiments on age-invariant face verification, age estimation and face age progression "in-the-wild".
Towards a complete 3D morphable model of the human head
Three-dimensional Morphable Models (3DMMs) are powerful statistical tools for
representing the 3D shapes and textures of an object class. Here we present the
most complete 3DMM of the human head to date that includes face, cranium, ears,
eyes, teeth and tongue. To achieve this, we propose two methods for combining
existing 3DMMs of different overlapping head parts: i. use a regressor to
complete missing parts of one model using the other, ii. use the Gaussian
Process framework to blend covariance matrices from multiple models. Thus we
build a new combined face-and-head shape model that blends the variability and
facial detail of an existing face model (the LSFM) with the full head modelling
capability of an existing head model (the LYHM). Then we construct and fuse a
highly-detailed ear model to extend the variation of the ear shape. Eye and eye
region models are incorporated into the head model, along with basic models of
the teeth, tongue and inner mouth cavity. The new model achieves
state-of-the-art performance. We use our model to reconstruct full head
representations from single, unconstrained images allowing us to parameterize
craniofacial shape and texture, along with the ear shape, eye gaze and eye
color.
Comment: 18 pages, 18 figures; submitted to Transactions on Pattern Analysis and Machine Intelligence (TPAMI) on the 9th of October as an extension paper of the original oral CVPR paper: arXiv:1903.0378
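One ingredient of the second combination strategy above (blending covariance matrices from multiple models) can be sketched generically. The following is a hedged illustration under the assumption that both models' covariances are expressed over the same set of points; `blend_covariances` and the weighting scheme are hypothetical stand-ins, not the paper's Gaussian Process formulation:

```python
import numpy as np

def blend_covariances(cov_a, cov_b, weight_a):
    """Blend two model covariance matrices defined over the same points.

    A convex combination of positive semi-definite (PSD) matrices is PSD,
    so the result is a valid covariance for a combined statistical model.
    weight_a may be a scalar, or a per-point weight in [0, 1] (e.g. high
    near the face region for a face model, low toward the cranium).
    """
    w = np.asarray(weight_a, dtype=float)
    if w.ndim == 1:
        # Per-point weights: D_a @ cov_a @ D_a + D_b @ cov_b @ D_b,
        # written as elementwise products with rank-1 weight matrices.
        return (np.sqrt(np.outer(w, w)) * cov_a
                + np.sqrt(np.outer(1.0 - w, 1.0 - w)) * cov_b)
    return w * cov_a + (1.0 - w) * cov_b

# Toy example: two random PSD covariances over 4 points.
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4)); cov_a = A @ A.T
B = rng.standard_normal((4, 4)); cov_b = B @ B.T
blended = blend_covariances(cov_a, cov_b, 0.7)
```

Both branches preserve symmetry and positive semi-definiteness, so the blended matrix can be decomposed (e.g. by eigendecomposition) into a new set of shape components.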
Multi-Attribute Robust Component Analysis for Facial UV Maps
The collection of large-scale three-dimensional (3-D) face models has led to significant progress in the field of 3-D face alignment "in-the-wild," with several methods being proposed toward establishing sparse or dense 3-D correspondences between a given 2-D facial image and a 3-D face model. Utilizing 3-D face alignment improves 2-D face alignment in many ways, such as alleviating issues with artifacts and warping effects in texture images. However, the utilization of 3-D face models introduces a new set of challenges for researchers. Since facial images are commonly captured in arbitrary recording conditions, a considerable amount of missing information and gross outliers is observed (e.g., due to self-occlusion, subjects wearing eye-glasses, and so on). To this end, in this paper we propose Multi-Attribute Robust Component Analysis (MA-RCA), a novel technique that is suitable for facial UV maps containing a considerable amount of missing information and outliers, while elegantly incorporating knowledge from various available attributes, such as age and identity. We evaluate the proposed method on problems such as UV denoising, UV completion, facial expression synthesis, and age progression, where MA-RCA outperforms competing techniques.
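The "missing information + low-rank structure" setting that MA-RCA addresses can be illustrated with a generic hard-impute-style completion loop. This is a sketch of the generic problem only, not the MA-RCA algorithm (which additionally models attributes and gross outliers); the function name and toy dimensions are assumptions:

```python
import numpy as np

def lowrank_complete(X, mask, rank, n_iters=200):
    """Generic matrix completion: alternate between (a) filling missing
    entries with the current low-rank estimate and (b) projecting onto
    the set of rank-`rank` matrices via truncated SVD."""
    filled = np.where(mask, X, 0.0)
    approx = filled
    for _ in range(n_iters):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        approx = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        filled = np.where(mask, X, approx)  # keep observed entries fixed
    return approx

rng = np.random.default_rng(2)
# Ground-truth rank-2 matrix with roughly 30% of entries missing.
L = rng.standard_normal((40, 2)) @ rng.standard_normal((2, 30))
mask = rng.random(L.shape) > 0.3
est = lowrank_complete(L, mask, rank=2)
```

With enough observed entries and an exactly low-rank target, the iteration recovers the missing values to small relative error; a UV map with self-occluded regions plays the role of `X` and `mask` in the abstract's setting.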
Relightify: Relightable 3D Faces from a Single Image via Diffusion Models
Following the remarkable success of diffusion models on image generation,
recent works have also demonstrated their impressive ability to address a
number of inverse problems in an unsupervised way, by properly constraining the
sampling process based on a conditioning input. Motivated by this, in this
paper, we present the first approach to use diffusion models as a prior for
highly accurate 3D facial BRDF reconstruction from a single image. We start by
leveraging a high-quality UV dataset of facial reflectance (diffuse and
specular albedo and normals), which we render under varying illumination
settings to simulate natural RGB textures and, then, train an unconditional
diffusion model on concatenated pairs of rendered textures and reflectance
components. At test time, we fit a 3D morphable model to the given image and
unwrap the face in a partial UV texture. By sampling from the diffusion model,
while retaining the observed texture part intact, the model inpaints not only
the self-occluded areas but also the unknown reflectance components, in a
single sequence of denoising steps. In contrast to existing methods, we
directly acquire the observed texture from the input image, thus, resulting in
more faithful and consistent reflectance estimation. Through a series of
qualitative and quantitative comparisons, we demonstrate superior performance
in both texture completion as well as reflectance reconstruction tasks.Comment: 14 pages, 12 figures. Project page:
https://foivospar.github.io/Relightify
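The constrained sampling described above (keeping the observed texture part intact while the model inpaints the rest) can be sketched in a toy form. The structure mirrors RePaint-style known-region replacement at each denoising step; the `denoise_step` callable, noise schedule, and all names here are illustrative stand-ins, not the paper's trained diffusion model:

```python
import numpy as np

def inpaint_with_prior(x_known, known_mask, denoise_step, n_steps=50, rng=None):
    """Inpainting by constrained sampling: run the reverse process from
    noise, and at every step overwrite the observed region with (a
    suitably noised version of) the known data, so only the masked-out
    region is hallucinated by the prior."""
    rng = rng or np.random.default_rng(0)
    x = rng.standard_normal(x_known.shape)  # start from pure noise
    for t in range(n_steps, 0, -1):
        x = denoise_step(x, t)              # one reverse step of the prior
        noise_level = t / n_steps
        # Keep the observed part intact, matched to the current noise level.
        x[known_mask] = (x_known[known_mask]
                         + noise_level * rng.standard_normal(known_mask.sum()))
    return x

# Toy usage: a trivial "denoiser" that shrinks toward zero stands in for
# a trained diffusion model; half the signal is observed, half inpainted.
x_known = np.zeros(64); x_known[:32] = 1.0
known_mask = np.zeros(64, dtype=bool); known_mask[:32] = True
out = inpaint_with_prior(x_known, known_mask, lambda x, t: 0.9 * x)
```

Because the observed entries are reset at every step, the final sample agrees with the input on the known region, while the unknown region is produced by the prior in the same sequence of denoising steps.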
FitMe: Deep Photorealistic 3D Morphable Model Avatars
In this paper, we introduce FitMe, a facial reflectance model and a
differentiable rendering optimization pipeline, that can be used to acquire
high-fidelity renderable human avatars from single or multiple images. The
model consists of a multi-modal style-based generator, that captures facial
appearance in terms of diffuse and specular reflectance, and a PCA-based shape
model. We employ a fast differentiable rendering process that can be used in an
optimization pipeline, while also achieving photorealistic facial shading. Our
optimization process accurately captures both the facial reflectance and shape
in high-detail, by exploiting the expressivity of the style-based latent
representation and of our shape model. FitMe achieves state-of-the-art
reflectance acquisition and identity preservation on single "in-the-wild"
facial images, while it produces impressive scan-like results, when given
multiple unconstrained facial images pertaining to the same identity. In
contrast with recent implicit avatar reconstructions, FitMe requires only one
minute and produces relightable mesh and texture-based avatars, that can be
used by end-user applications.
Comment: Accepted at CVPR 2023; project page at https://lattas.github.io/fitme; 17 pages including supplementary material
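The analysis-by-synthesis optimization at the heart of such fitting pipelines can be sketched with a toy differentiable "renderer". Here a linear map stands in for FitMe's style-based generator and differentiable shading (so the gradient is analytic); `R`, `z`, and `target` are illustrative assumptions, not the paper's components:

```python
import numpy as np

rng = np.random.default_rng(3)
R = rng.standard_normal((100, 8))   # toy differentiable "renderer"
z_true = rng.standard_normal(8)     # ground-truth latent parameters
target = R @ z_true                 # "observed" image

# Gradient descent on the photometric loss ||R z - target||^2.
z = np.zeros(8)
lr = 1.0 / (2.0 * np.linalg.norm(R, 2) ** 2)  # stable step size
for _ in range(500):
    residual = R @ z - target       # photometric error
    grad = 2.0 * R.T @ residual     # analytic gradient of the squared loss
    z -= lr * grad
```

The real pipeline replaces the linear map with a style-based reflectance generator plus a differentiable renderer and relies on autodiff, but the loop structure (render, compare to the image, step the latent code) is the same.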